Section: Software

Main software

Participants: Olivier Collin [correspondent], Dominique Lavenier, François Coste, Olivier Sallou, Romaric Sabas, Guillaume Rizk, Andres Burgos.

We highlight here three software tools of the team that received particular attention this year, notably to improve their usability and distribution. The following sections describe all of the team's software, classified by application domain.

BioMAJ: Data synchronization and processing workflow

BioMAJ (BIOlogie Mise A Jour) is a workflow engine dedicated to data synchronization and processing. The software automates the update cycle and the supervision of locally mirrored databank repositories. Thanks to funding from an INRIA ADT, BioMAJ's usability has been improved and its distribution broadened. It is now part of a Linux distribution (Debian-med). The tool is used in many bioinformatics core facilities in France and Europe, both as an infrastructure tool and as a key component of new resources. For example, the AnnotQTL tool relies heavily on BioMAJ. Another example is PopGenIE, an integrative explorer of the Populus genome developed in Sweden, which has been built on top of BioMAJ.

[Web site: http://biomaj.genouest.org]

GASSST: Short read mapper for large genomic datasets

GASSST is a short read mapper that can process very large genomic datasets. It takes as input raw data (reads) coming from next-generation sequencing machines and maps them onto full genomes. In 2011, GASSST was tuned to meet industrial requirements and transferred to the GenomeQuest company. A specific license agreement has been set up between INRIA and GenomeQuest to integrate GASSST into the GenomeQuest NGS tool suite.
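To make the notion of read mapping concrete, here is a minimal Python sketch of a generic seed-and-verify strategy: index the k-mers of the genome, use the k-mers of a read as seeds, and keep the candidate positions whose Hamming distance to the read is small. This is a toy illustration under simplifying assumptions (substitutions only, a single strand, everything in memory); the function names and parameters are hypothetical and it does not describe GASSST's actual algorithm or interface.

```python
from collections import defaultdict


def build_kmer_index(genome, k):
    """Map each k-mer of the genome to the list of positions where it starts."""
    index = defaultdict(list)
    for pos in range(len(genome) - k + 1):
        index[genome[pos:pos + k]].append(pos)
    return index


def hamming(a, b):
    """Number of mismatching positions between two equal-length strings."""
    return sum(x != y for x, y in zip(a, b))


def map_read(read, genome, index, k, max_mismatches=2):
    """Return sorted (position, mismatches) pairs where the read maps.

    Hypothetical helper: every k-mer of the read is used as a seed, and each
    candidate start position is verified once against the genome.
    """
    hits = {}
    seen = set()  # candidate starts already verified
    for offset in range(len(read) - k + 1):
        seed = read[offset:offset + k]
        for seed_pos in index.get(seed, ()):
            start = seed_pos - offset
            if start in seen:
                continue
            seen.add(start)
            # Discard candidates that would overhang the genome.
            if start < 0 or start + len(read) > len(genome):
                continue
            mismatches = hamming(read, genome[start:start + len(read)])
            if mismatches <= max_mismatches:
                hits[start] = mismatches
    return sorted(hits.items())
```

With the toy genome `"ACGTACGTTAGCCGATAGGCTA"` and `k = 4`, the read `"TAGCCGAT"` maps at position 8 with no mismatch, and `"TAGCCGTT"` maps at the same position with one mismatch. Production mappers add filtering steps, support for insertions and deletions, and compact indexes to scale to full genomes.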

[Web site: http://www.irisa.fr/symbiose/projects/gassst/]

Protomata-Learner: fine characterization of protein families

Protomata-Learner V2.0 is a tool that infers weighted automata characterizing (structural or functional) protein families from a sample of (unaligned) sequences belonging to the family. Protomata-Learner has been completely rewritten thanks to the ADT "Suite logicielle pour la modélisation de familles protéiques par automates" (software suite for modeling protein families with automata): built on a better formalization and on efficient weighting techniques, the new version is significantly faster and gives better results. Special care has been given to integrating the different programs into an easy-to-use suite.

Protomata-Learner has been tested and improved on real use cases thanks to collaborations established in the Lepidolf and Pelican ANR projects. New scanning algorithms (Forward scores) and procedures for automatically choosing the best set of parameters have been developed. New signatures for the studied protein families have been established and are used by our partners to predict candidates.
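As an illustration of what a Forward score computes, the following Python sketch sums, over all paths of a weighted automaton that read a sequence, the product of the weights along each path (a Viterbi-style score would keep only the best path instead). The dictionary-based automaton representation and the function name are assumptions made for this example; they are not Protomata-Learner's data structures or file formats.

```python
from collections import defaultdict


def forward_score(sequence, initial, transitions, final):
    """Total weight of all accepting paths that read `sequence`.

    Hypothetical representation:
      initial:     {state: initial weight}
      transitions: {(state, symbol): [(next_state, weight), ...]}
      final:       {state: final weight}
    """
    # alpha[state] = summed weight of all paths reaching `state`
    # after reading the symbols processed so far.
    alpha = dict(initial)
    for symbol in sequence:
        next_alpha = defaultdict(float)
        for state, mass in alpha.items():
            for target, weight in transitions.get((state, symbol), ()):
                next_alpha[target] += mass * weight
        alpha = next_alpha
    return sum(mass * final.get(state, 0.0) for state, mass in alpha.items())


# Toy automaton: two distinct paths read "AC".
initial = {"q0": 1.0}
transitions = {
    ("q0", "A"): [("q1", 0.5), ("q2", 0.5)],
    ("q1", "C"): [("q3", 0.5)],
    ("q2", "C"): [("q3", 0.25)],
}
final = {"q3": 1.0}

score = forward_score("AC", initial, transitions, final)  # 0.5*0.5 + 0.5*0.25
```

For long protein sequences, such products underflow quickly, so a practical implementation would typically work in log space or rescale the weights at each step.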

[Web site: http://tools.genouest.org/tools/protomata/]